A Two-Stage Random Forest-Based Pathway Analysis Method

نویسندگان

  • Ren-Hua Chung
  • Ying-Erh Chen
چکیده

Pathway analysis provides a powerful approach for identifying the joint effect of genes grouped into biologically-based pathways on disease. Pathway analysis is also an attractive approach for a secondary analysis of genome-wide association study (GWAS) data that may still yield new results from these valuable datasets. Most of the current pathway analysis methods focused on testing the cumulative main effects of genes in a pathway. However, for complex diseases, gene-gene interactions are expected to play a critical role in disease etiology. We extended a random forest-based method for pathway analysis by incorporating a two-stage design. We used simulations to verify that the proposed method has the correct type I error rates. We also used simulations to show that the method is more powerful than the original random forest-based pathway approach and the set-based test implemented in PLINK in the presence of gene-gene interactions. Finally, we applied the method to a breast cancer GWAS dataset and a lung cancer GWAS dataset and interesting pathways were identified that have implications for breast and lung cancers.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Comparison of Random Forest and Logistic Regression Methods in Predicting Mortality in Colorectal Cancer Patients and its Related Factors

Background and Objectives: The purpose of this study was to predict the mortality rate of colorectal cancer in Iranian patients and determine the effective factors  on the mortality of patients with colorectal cancer using random forest and logistic regression methods.   Methods: Data from 304 patients with colorectal cancer registry from the Gastroenterology and Liver Research Center of Shah...

متن کامل

Three-stage inversion improvement for forest height estimation using dual-PolInSAR data

This paper addresses an algorithm for forest height estimation using single frequency single baseline dual polarization radar interferometry data. The proposed method is based on a physical two layer volume over ground model and is represented using polarimetric synthetic aperture radar interferometry (PolInSAR) technique. The presented algorithm provides the opportunity to take advantages of t...

متن کامل

Evaluation of risk factors of recurrence of hodgkin\'s lymphoma using random survival forest and comparison with cox regression model

Background: In many studies, Cox regression was used to assess the important factors that affect the survival of cancer patients based on demographic and clinical variables. The aim of this study was to determine the factors affecting the survival of patients with Hodgkin's lymphoma using the random survival forest (RSF) method and compare it with the Cox model. Methods: In this retrospective ...

متن کامل

Scheduling and Stochastic Capacity Estimation of an EV Charging Station with PV Rooftop Using Queuing Theory and Random Forest

Power capacity of EV charging stations could be increased by installing PV arrays on their rooftops. In these charging stations, power transmission can be two-sided when needed. In this paper a new method based on queuing theory and random forest algorithm proposed to calculate net power of charging station considering random SOC of EV’s. Due to estimation time constraints, a queuing model with...

متن کامل

3D Detection of Power-Transmission Lines in Point Clouds Using Random Forest Method

Inspection of power transmission lines using classic experts based methods suffers from disadvantages such as highel level of time and money consumption. Advent of UAVs and their application in aerial data gathering help to decrease the time and cost promenantly. The purpose of this research is to present an efficient automated method for inspection of power transmission lines based on point c...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 7  شماره 

صفحات  -

تاریخ انتشار 2012